91 research outputs found
BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings
In this paper, we propose a bidimensional attention based recursive
autoencoder (BattRAE) to integrate clues and sourcetarget interactions at
multiple levels of granularity into bilingual phrase representations. We employ
recursive autoencoders to generate tree structures of phrases with embeddings
at different levels of granularity (e.g., words, sub-phrases and phrases). Over
these embeddings on the source and target side, we introduce a bidimensional
attention network to learn their interactions encoded in a bidimensional
attention matrix, from which we extract two soft attention weight distributions
simultaneously. These weight distributions enable BattRAE to generate
compositive phrase representations via convolution. Based on the learned phrase
representations, we further use a bilinear neural model, trained via a
max-margin method, to measure bilingual semantic similarity. To evaluate the
effectiveness of BattRAE, we incorporate this semantic similarity as an
additional feature into a state-of-the-art SMT system. Extensive experiments on
NIST Chinese-English test sets show that our model achieves a substantial
improvement of up to 1.63 BLEU points on average over the baseline.Comment: 7 pages, accepted by AAAI 201
Translating Phrases in Neural Machine Translation
Phrases play an important role in natural language understanding and machine
translation (Sag et al., 2002; Villavicencio et al., 2005). However, it is
difficult to integrate them into current neural machine translation (NMT) which
reads and generates sentences word by word. In this work, we propose a method
to translate phrases in NMT by integrating a phrase memory storing target
phrases from a phrase-based statistical machine translation (SMT) system into
the encoder-decoder architecture of NMT. At each decoding step, the phrase
memory is first re-written by the SMT model, which dynamically generates
relevant target phrases with contextual information provided by the NMT model.
Then the proposed model reads the phrase memory to make probability estimations
for all phrases in the phrase memory. If phrase generation is carried on, the
NMT decoder selects an appropriate phrase from the memory to perform phrase
translation and updates its decoding state by consuming the words in the
selected phrase. Otherwise, the NMT decoder generates a word from the
vocabulary as the general NMT decoder does. Experiment results on the Chinese
to English translation show that the proposed model achieves significant
improvements over the baseline on various test sets.Comment: Accepted by EMNLP 201
CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models
Holistically measuring societal biases of large language models is crucial
for detecting and reducing ethical risks in highly capable AI models. In this
work, we present a Chinese Bias Benchmark dataset that consists of over 100K
questions jointly constructed by human experts and generative language models,
covering stereotypes and societal biases in 14 social dimensions related to
Chinese culture and values. The curation process contains 4 essential steps:
bias identification via extensive literature review, ambiguous context
generation, AI-assisted disambiguous context generation, snd manual review \&
recomposition. The testing instances in the dataset are automatically derived
from 3K+ high-quality templates manually authored with stringent quality
control. The dataset exhibits wide coverage and high diversity. Extensive
experiments demonstrate the effectiveness of the dataset in detecting model
bias, with all 10 publicly available Chinese large language models exhibiting
strong bias in certain categories. Additionally, we observe from our
experiments that fine-tuned models could, to a certain extent, heed
instructions and avoid generating outputs that are morally harmful in some
types, in the way of "moral self-correction". Our dataset and results are
publicly available at
\href{https://github.com/YFHuangxxxx/CBBQ}{https://github.com/YFHuangxxxx/CBBQ},
offering debiasing research opportunities to a widened community
- …